Abstract

This paper investigates the dissemination of multiple pieces of information in large 
networks where users contact each other in a random uncoordinated manner, and users 
upload one piece per unit time. The underlying motivation is the design and analysis 
of piece selection protocols for peer-to-peer networks which disseminate files by dividing 
them into pieces. We first investigate one-sided protocols, where piece selection is based on 
the states of either the transmitter or the receiver. We show that any such protocol relying 
only on pushes, or alternatively only on pulls, is inefficient in disseminating all pieces to 
all users. We propose a hybrid one-sided piece selection protocol - INTERLEAVE - and 
show that by using both pushes and pulls it disseminates k pieces from a single source 
to n users in $O(k + \log n)$ time, while obeying the constraint that each user can upload
at most one piece in one unit of time, with high probability for large n. An optimal,
unrealistic centralized protocol would take $k + \log_2 n$ time in this setting. Moreover,
efficient dissemination is also possible if the source implements forward erasure coding,
and users push the latest-released coded pieces (but do not pull). We also investigate
two-sided protocols where piece selection is based on the states of both the transmitter and
the receiver. We show that it is possible to disseminate n pieces to n users in $n + O(\log n)$
time, starting from an initial state where each user has a unique piece.



1 Introduction 



Peer-to-peer systems are decentralized networks enabling users to contribute resources for mu- 
tual benefit. One of the main applications of such networks is the cost-effective distribution of 
bandwidth-intensive content from one source, or a few sources, to many users simultaneously. 
Peer-to-peer networks such as eDonkey and BitTorrent, which routinely serve files hundreds
of megabytes in length to thousands of users, now account for a sizable share of all Internet 
traffic [1]. Examples of content distribution systems leveraging end-user resources are [2-5]. 
The service capacity in such systems can grow with the number of users, making them scalable 
and efficient in servicing a large number of users [6, 7]. 
File dissemination networks can be broadly categorized into structured and unstructured
networks. Structured networks such as [3-5] rely on a specific structured pattern of intercon- 
nections among users to deliver the advantages of scalability and robustness. The structured 
pattern is first set up in a decentralized fashion, and data is then disseminated using this
infrastructure. Unstructured networks have minimal infrastructure, and instead rely on
randomization to provide load balancing, robustness, and scalability. For example, in the BitTorrent
[2] system the only available infrastructure is a tracker of the addresses of users interested in 
obtaining the file. Each user acquires a random list of other users from the tracker, who become 
neighbors. Each user's actions are based on local information obtained from its neighbors. 

This paper investigates data dissemination in unstructured networks. Initial unstructured 
approaches [8,9] advocated uploading the whole file at one go. Users receiving the complete 
file would then upload it to other users chosen at random. These protocols were motivated in 
part by earlier theoretical work on random gossip models [10, 11] and epidemics [12]. However, 
for large files, making users wait to receive the entire file before they can start serving it 
becomes untenable for two reasons: (a) file transfer may take a long time, and during this 
time the upload capacity of downloading users is wasted, and (b) users who have received the 
file may depart before uploading a complete copy, resulting in the complete file being lost to 
others. Modern peer-to-peer file dissemination protocols such as BitTorrent take the following
alternate approach to speed up dissemination: the file is divided into pieces, and users can start 
serving individual pieces once they are received, instead of waiting to obtain the entire file. 

When the file to be disseminated is divided into multiple pieces, each user has to carry out 
the task of piece selection: deciding which particular piece of the file is to be communicated 
at any given time, based on local information (see footnote 1). These local decisions have a significant impact
on the global effectiveness of file dissemination, as the spread of one piece interacts with the 
spread of other pieces. The motivations of this paper are (1) to gain a quantitative analytical 
understanding of how splitting a file speeds up its dissemination in networks with random 
user contacts, and (2) to understand how the users' local piece selection decisions impact the 
dissemination of multiple pieces. 

In Section 2, we develop a simple model of a peer-to-peer system relaying multiple pieces. 
We also explain how it captures the speedup obtained from file splitting, and enables us to 
compare the efficiency of various piece selection protocols. The user contact model is the same 
as in the classical random gossip process literature [8-11]. In Section 3 we state our main 
results, and discuss their implications. Sections 4 and 5 contain the lemmas and proofs of the 
main theorems. Section 6 contains some simulation results on protocol performance when some 
of the assumptions made in the model are relaxed. 

Footnote 1: In BitTorrent, for example, this decision is made by each user based only on information from its neighbors.
Each peer polls its neighbors for their piece collections, and then downloads the locally rarest piece, i.e. the
piece that has the lowest representation among the peer's neighbors.
2 Framework



We now present our peer-to-peer system model. A real-world deployed peer-to-peer network 
such as BitTorrent is an immensely complex system to model and analyze exactly, and we
simplify some aspects for the purposes of tractability. In the following we first describe our 
framework, and then discuss the assumptions and approximations made. 

Consider a network with n users, each of whom wants to receive an entire copy of a file. The 
file is divided into pieces. All users have the same upload bandwidth, and time is measured in 
slots. The length of each slot is the time one user takes to upload one piece. Any user receiving 
a piece in some slot t can upload that piece to other users beginning in slot t + 1. Thus the 
spread of one piece interacts with the spread of other pieces via the bandwidth constraint. 

Throughout this paper, it is assumed that the users contact each other in the following 
manner: in each time slot each user chooses a target, which is another user chosen uniformly 
at random from the entire network, independently of any state, history, or other users' choices. 
Communication in that time slot occurs only between each user and its target. This is the 
contact and communication model used in the classical single-piece random gossip literature 
[8-11]. We provide bounds and performance guarantees that hold with high probability for 
large n. Our work goes beyond the classical random gossip literature in that it investigates the 
simultaneous spread of multiple pieces. 

Once a target is chosen, the user undertakes one of the following two possible actions:

• pull: the user selects a piece it does not currently possess and requests it from the target. 

• push: the user selects a piece it possesses, and transmits it to the target. 

For either of the two actions above, the user needs to make a piece selection. This piece selection 
is said to be one-sided if it is based only on the user's own current state, and not on the state of 
the target. The piece selection is said to be two-sided if it is based on the current states of both 
the user and the target. In either case the selection is independent of system history or the 
states of other users. Different ways of making this choice correspond to different protocols. In 
this paper we evaluate the performance of the protocols as measured by the completion time, 
which is the first time slot that all users have all pieces. 

Users have limited upload bandwidth. In this paper this is represented by either a hard 
constraint or a soft constraint. In the former, each user can upload at most one piece in any 
instant of time, while in the latter a user is allowed to upload (potentially) any number of 
pieces simultaneously. The fact that targets are chosen uniformly at random means that a 
network with n users most likely has a maximum loading of at most $\log n$ for the case of soft
constraints. Soft constraints have previously been analyzed in the random gossip literature, 
see e.g. [9, 13]. In our work we do not impose any constraints on the download bandwidth of 
the users. However, due to random user contacts and upload constraints, the average usage of 
download bandwidth is still one piece per slot. 
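
The loading claim above can be checked numerically. The following Python sketch is an illustration only and is not part of the paper's analysis; the function name `max_target_load` and the parameter values are assumptions made for this example. It draws one slot of uniformly random targets and reports the largest number of users that chose the same target.

```python
import math
import random
from collections import Counter


def max_target_load(n: int, seed: int = 0) -> int:
    """Largest number of users that pick the same target in one slot."""
    rng = random.Random(seed)
    loads = Counter()
    for user in range(n):
        # Pick a target uniformly among the other n - 1 users.
        target = rng.randrange(n - 1)
        if target >= user:
            target += 1
        loads[target] += 1
    return max(loads.values())


n = 10_000
print(f"n = {n}: max load = {max_target_load(n)}, log n = {math.log(n):.1f}")
```

Under soft constraints this maximum is the largest number of simultaneous uploads any single user would be asked to perform in a slot.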

The following simple calculation, similar to the one in [6], demonstrates the potential 
speedup that can be had from file splitting. Consider an initial condition where the file is
divided into k pieces, and all are initially present at one user, called the source. It is easy to 
see that the completion time would be at least $k + \log_2 n$ time slots, because it takes at least
k slots for the last piece to emerge from the source, and a further $\log_2 n$ slots for that piece to
reach all users. This $k + \log_2 n$ lower bound holds even for systems that employ coding. It has
been shown in [14], under the same upload and download constraints as in this paper, that a
fully centralized scheme can achieve this bound for all n and k. Other related work, including
[15, 16], with slightly different communication constraints, also points to optimal dissemination
times that are close to $k + \log_2 n$. If, however, the file is not divided into pieces, each user can
upload data only after receiving a copy of the entire file, which takes k time slots. In this case,
complete file dissemination takes at least $k \log_2 n$ time slots, because the number of users having
the file can at most double every k time slots. If we denote the ratio of the dissemination time 
for an unsplit file to the dissemination time of a file split into k pieces as the speedup achieved 
by splitting the file into k pieces, then 

\[
\text{optimal splitting speedup} = \frac{k \log_2 n}{k + \log_2 n}
\]

From the above we see that if a decentralized protocol with random user contacts has a
completion time close to $k + \log n$ then its performance is close to that of a centralized optimal
protocol, while if its completion time is closer to $k \log n$ then it is performing badly, providing
little speedup from file splitting.
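
To make the ratio concrete, the short Python sketch below evaluates it for a few values of k. The function name `optimal_splitting_speedup` and the chosen values of n and k are illustrative assumptions, not taken from the paper.

```python
import math


def optimal_splitting_speedup(n: int, k: int) -> float:
    """Ratio of the unsplit bound k*log2(n) to the split bound k + log2(n)."""
    log_n = math.log2(n)
    return (k * log_n) / (k + log_n)


# For a fixed network size the ratio grows with k but stays below log2(n).
for k in (10, 100, 1000):
    print(k, round(optimal_splitting_speedup(1024, k), 2))
```

For fixed n the ratio approaches $\log_2 n$ as k grows, so under this calculation the benefit of finer splitting is capped by the network size.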

For large networks, splitting a file into a large number of pieces gives significant speedup 
gains, but at the expense of increased overhead. For example, two-sided protocols may require 
users to maintain the current states of their neighbors. This may be hard when there are a 
large number of small pieces. This is the motivation for investigating piece-selection protocols 
relying on less than complete information, of which one-sided protocols represent an extreme 
case. Other overheads arise from network and system considerations, such as the choice of the 
underlying transport control protocol. 

We now briefly discuss the modeling approximations made. Network effects including 
delays, packet losses due to congestion, and user heterogeneity have been abstracted away: 
we assume that communicating a piece always takes the same amount of time, between any 
two users. Also, real-world systems are typically open, with users joining and leaving, while 
our analysis assumes the simultaneous arrival of a large number of users who are present until 
system completion. The model would be a reasonable approximation for the servicing of a 
flash crowd scenario, in which a large number of users arrive almost simultaneously, which tests 
the scalability and efficiency of any file dissemination system most severely. However, different 
models may be required for other situations. Also, each user may only have a limited view of 
the network, and may not be able to contact users chosen uniformly at random from the entire 
network. We relax this last assumption using simulations in Section 6. 

Finally, an important component of any peer-to-peer system is the incentive mechanism 
used to ensure users do not leech off the system. In this work we do not investigate incen- 
tives, but comment that it may be possible to design token-based incentive schemes that are 
compatible with our piece selection protocols. 
3 Our Results: Protocols and their Performance



The main contributions of this work are outlined in this section. The piece selection protocols 
are described, along with corresponding performance bounds, in the following order: one-sided 
protocols using only pull, one-sided protocols using only push, the new hybrid one-sided protocol 
INTERLEAVE, and a two-sided protocol. 

In our investigation of one-sided protocols, we assume that the file is divided into $k = k(n)$
pieces, where k(n) is at most a polynomial function of n, and present results that hold with 
high probability for large n. Thus the relative number of pieces and users is allowed to vary 
over a broad range. 

One-sided pull-based protocols are those where all communication occurs only via pulls, 
and piece selection is one-sided. Two examples within this class of protocols are: 



RANDOM PULL: In each slot each user requests a random piece from the set 
of pieces it does not possess. 



SEQUENTIAL PULL: Pieces are numbered $1, \ldots, k$, and in each slot each
user pulls the lowest numbered piece it does not possess. 
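
The two selection rules above can be sketched as small Python functions. This is a minimal illustration: the function names, the representation of a user's state as a set of piece numbers, and the convention of returning `None` when nothing is missing are assumptions of this sketch.

```python
import random
from typing import Optional, Set


def random_pull(have: Set[int], k: int, rng: random.Random) -> Optional[int]:
    """RANDOM PULL: request a uniformly random piece among those missing."""
    missing = [p for p in range(1, k + 1) if p not in have]
    return rng.choice(missing) if missing else None


def sequential_pull(have: Set[int], k: int) -> Optional[int]:
    """SEQUENTIAL PULL: request the lowest numbered missing piece."""
    for p in range(1, k + 1):
        if p not in have:
            return p
    return None


print(sequential_pull({1, 2, 4}, k=5))
print(random_pull({1, 2, 4}, k=5, rng=random.Random(0)))
```

Both rules look only at the requesting user's own state, which is what makes them one-sided.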



The following three theorems hold for any one-sided pull-based protocol, and for hard or 
soft constraints. The first is a negative result showing that the time needed to disseminate a 
fixed fraction of the pieces to a small fraction of users grows as the product of the number of 
pieces k and $\log n$. This means that using one-sided pull during initial dissemination fails to
exploit the potential speedup due to file splitting. 

Theorem 1 For any $0 < \beta < 1$, starting with k pieces in one user each, let $T_\beta$ be the time taken
till at least $\beta k$ pieces are present in users each, using any one-sided pull-based protocol.
Then, for any $\epsilon > 0$,

\[ P\left[ T_\beta > \beta (1 - \epsilon) k \log n \right] > 1 - n^{-c} \]

for all c > 0, k at most polynomial in n, and n large enough.



The next theorem shows that, starting from a state where each piece is present in a fraction 
of the nodes, any pull-based protocol delivers all pieces to all users in $O(k + \log n)$ time, with
high probability. Thus, pull finishes dissemination within a constant factor of the time needed 
by the optimal centralized protocol. 



Theorem 2 Let $0 < \eta < 1$ and consider any pull-based protocol. From a state such that each
piece is in $\eta n$ users, if $T_\eta$ is the time till all users have all pieces then

\[
P\left[ T_\eta \le \frac{\log(1 + e/\eta)}{\log(1 + \eta/e)}\, k + \frac{1 + c}{\log(1 + \eta/e)} \log n \right] > 1 - n^{-c} \qquad (1)
\]

for all c > 0, and any n and k.



Intuitively, the reason that pull protocols are efficient from a starting state such as the one 
in Theorem 2 is that each pull request has a probability greater than $\eta$ of targeting a user who
can service the request. 

The next theorem gives an upper bound on the completion time for a pull protocol. 

Theorem 3 Consider a network with n users and k pieces, each initially present at one user, 
implementing a pull-based protocol for piece dissemination. If T is the first time that all users 
have all pieces, then given any $\delta > 0$, any c > 0, n large enough, and k arbitrary,

\[ P\left[ T < 4e (1 + \delta) \left( k \log k + (1 + c) k \log n \right) \right] > 1 - 2 n^{-c} \]

If k grows at most polynomially in n, then $k \log k$ is $O(k \log n)$, and Theorem 3 implies an
upper bound of $O(k \log n)$. Thus, Theorems 1 and 3 together show that the completion time
for any one-sided pull-based protocol is $\Theta(k \log n)$ with high probability.

One-sided push-based protocols are those where all communication occurs only via pushes, 
and piece selection is one-sided. An example of such a protocol is the following. 



RANDOM PUSH: In each slot each user pushes a random piece from the set 
of pieces it possesses. 



Unlike pull-based protocols, some push-based protocols may never reach completion. For 
example, if pieces are pushed in a strict predefined priority order by all users, then the spread of 
lower priority pieces may be suppressed by the spread of the higher priority ones. However, other 
push-protocols such as RANDOM PUSH eventually reach completion. The following theorem 
shows that one-sided push-based protocols are slow in the final stages of dissemination. 

Theorem 4 For $0 < \beta < 1$, from an initial state in which $\beta k$ pieces are each absent in -^-^
users, let $T_\beta$ be the time taken till all pieces are present in all users, using any push-based
protocol. Then, for any $\epsilon > 0$,

\[ P\left[ T_\beta > \beta (1 - \epsilon) k \log n \right] > 1 - n^{-c} \]

for all c > 0, k at most polynomial in n, and n large enough.

Since the completion time grows linearly in the product of $\beta k$ and $\log n$, Theorem 4 shows
that purely push-based protocols provide no speedup from file splitting. 

We now outline how the above results motivate the design of the hybrid efficient one-sided 
protocol INTERLEAVE. Theorems 1 and 4 show that any protocol relying on only one of the 
push or pull mechanisms can provide no speedup from file splitting. Also, pull protocols are
inefficient near the start but efficient in the end, while push protocols are inefficient in the end. 
This indicates that it may be possible that a hybrid protocol, in which users execute pushes 
and pulls, can ensure efficient completion. 

Theorem 2 shows that, from the viewpoint of achieving $O(k + \log n)$ dissemination time, an
intermediate state - in which each piece is present in $\eta n$ users for some $\eta > 0$ independent of
n - is of fundamental importance. From such a state, user pulls in any hybrid protocol would
enable completion in $O(k + \log n)$ time. To design a hybrid protocol whose overall completion
time, from start to finish, is also $O(k + \log n)$, we would thus need to

(a) design a push protocol that reaches the intermediate state in $O(k + \log n)$ time, and

(b) combine the above push protocol with a pull-based protocol in a decentralized way. 

This is the idea underlying the design of the efficient protocol INTERLEAVE. 

Working towards the first of the above two objectives, we notice that while Theorem 2 
guarantees the efficiency of pulls in the end, there is no analogous theorem for the efficiency 
of push protocols in the beginning. In particular, some push protocols may not reach the 
intermediate state in $O(k + \log n)$ time. We thus need to design a push protocol for this
objective. Consider the one-sided push-based protocol PRIORITY PUSH. 



PRIORITY PUSH: 

• pieces are numbered 1,2,... 

• in each slot every user other than the source transmits a copy of 
the highest numbered piece it has received until that time. 

• The source transmits piece number i in the time slots $(i - 1)\ell + 1, \ldots, i\ell$,
where $\ell \ge 1$ is an integer called the spacing of the
protocol.



Theorem 5 Given any $\delta > 0$ and $0 < c < 1$, if the PRIORITY PUSH protocol with spacing
$\ell$ is implemented, the probability that a given piece p reaches $n(1 - e^{-\ell} - \delta)$ users within time
$(1 + \delta) \log_2 n$ after leaving the source is at least $1 - 3n^{-c}$ for large enough n.
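
A small simulation can illustrate the coverage that PRIORITY PUSH achieves. The Python sketch below is a rough experiment under stated assumptions, not the paper's proof technique: user 0 plays the source, `priority_push_coverage` and its parameters are names invented here, and the run continues for a fixed number of extra slots after the last piece leaves the source. For spacing 1 the fraction of users reached by a mid-sequence piece would be expected to land near $1 - e^{-1} \approx 0.63$.

```python
import random


def priority_push_coverage(n, num_pieces, spacing=1, extra_slots=60, seed=0):
    """Simulate PRIORITY PUSH; return, per piece, the fraction of
    non-source users that ever received it."""
    rng = random.Random(seed)
    highest = [0] * n  # highest numbered piece held by each user (0 = none)
    reached = [set() for _ in range(num_pieces + 1)]
    for t in range(1, num_pieces * spacing + extra_slots + 1):
        sends = []  # (target, piece), decided from the start-of-slot state
        for u in range(n):
            if u == 0:
                # Source: piece i is sent in slots (i-1)*spacing+1 .. i*spacing.
                piece = (t + spacing - 1) // spacing
                if piece > num_pieces:
                    continue
            else:
                piece = highest[u]
                if piece == 0:
                    continue
            target = rng.randrange(n - 1)
            if target >= u:
                target += 1
            sends.append((target, piece))
        for target, piece in sends:
            if target != 0:
                reached[piece].add(target)
                highest[target] = max(highest[target], piece)
    return [len(reached[p]) / (n - 1) for p in range(1, num_pieces + 1)]


coverage = priority_push_coverage(n=2000, num_pieces=30)
print(f"piece 15 reached a fraction {coverage[14]:.3f} of the users")
```

The last few pieces are not representative in this sketch, because no higher numbered piece arrives to displace them once the source stops.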

Before we proceed with the design of an efficient hybrid one-sided protocol, we briefly 
comment on the use of PRIORITY PUSH for the case when the source has the ability to 
generate pieces that are forward-erasure-coded versions of the original file pieces. With forward 
erasure coding, each user now only needs to build a large enough set of distinct coded pieces to 
be able to recover the original file. A protocol based on a combination of rateless forward error 
correction at the source, as proposed for example in [17], and piece relay within the network 
using PRIORITY PUSH, would work as follows: the source pushes a new coded piece in every
time slot, which is labeled with a piece number as required by the PRIORITY PUSH protocol. 
A user at any time would transmit the highest numbered (coded) piece it currently possesses. 
PRIORITY PUSH ensures that each user receives approximately 63.2% of the coded pieces 
emerging from the source, and each of these pieces is received approximately $\log_2 n$ time slots
after it emerges from the source. This means that each user can build up a large enough
collection of coded pieces in a timely fashion, enabling the decoding of the source file. Such a 
combination of source coding and PRIORITY PUSH may be a good candidate in scenarios such 
that two-way communication between users is impossible or infeasible, because in this case the 
pulling of pieces by users would not be possible. The delay guarantee provided by PRIORITY 
PUSH means that it might also be a good candidate for the relaying of source-encoded data 
that is of a streaming/real-time nature. 

Turning to the design of an efficient hybrid one-sided protocol, observe that PRIORITY 
PUSH (with $\ell = 1$) manages to deliver almost every piece to about $(1 - e^{-1})n$ nodes. This
makes it a suitable candidate for combining with a pull protocol to finally generate a hybrid
protocol. The fact that PRIORITY PUSH tends to deliver pieces with lower numbers before pieces
with higher numbers, suggests that a good pull protocol to combine with PRIORITY PUSH 
is the SEQUENTIAL PULL protocol. The protocols are combined by having users alternate 
between pushing and pulling. The performance guarantee of PRIORITY PUSH is more fragile 
than that of SEQUENTIAL PULL, and for this reason the hybrid protocol INTERLEAVE is 
designed so that the pulling does not interfere with the pushing. The protocol is described 
below. 



INTERLEAVE: 

• Pieces are numbered 1,2,... 

• In every odd time slot, the source pushes the piece with number 
one higher than the one it transmitted in the previous odd time 
slot. Every other user pushes the highest numbered piece it re- 
ceived in the previous odd time slots. The user may have a higher 
numbered piece obtained in an even time slot, but this is not the 
one chosen for transmission. 

• In every even time slot, every user sends a pull request for the 
lowest numbered piece it does not already have. In this slot users 
do not distinguish pieces based on whether they were received in 
even or odd time slots. 
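
The description above can be turned into a small simulation. The Python sketch below is one possible reading of INTERLEAVE under hard upload constraints, with several assumptions that the text does not fix: user 0 is the source, a contested target serves one servable pull request chosen at random, and `simulate_interleave` is a name invented for this example.

```python
import random
from collections import defaultdict


def simulate_interleave(n, k, seed=0, max_slots=100_000):
    """Return the first slot by which every user holds all k pieces."""
    rng = random.Random(seed)
    have = [set() for _ in range(n)]
    have[0] = set(range(1, k + 1))  # user 0 is the source
    push_high = [0] * n             # highest piece received in odd (push) slots
    lowest_missing = [1] * n
    incomplete = n - 1

    def pick_target(u):
        v = rng.randrange(n - 1)
        return v + 1 if v >= u else v

    def deliver(u, piece):
        nonlocal incomplete
        if piece not in have[u]:
            have[u].add(piece)
            if len(have[u]) == k:
                incomplete -= 1

    for t in range(1, max_slots + 1):
        if t % 2 == 1:  # odd slot: everyone pushes
            sends = []
            for u in range(n):
                piece = (t + 1) // 2 if u == 0 else push_high[u]
                if 1 <= piece <= k:
                    sends.append((pick_target(u), piece))
            for v, piece in sends:
                if v != 0:
                    push_high[v] = max(push_high[v], piece)
                    deliver(v, piece)
        else:  # even slot: sequential pulls
            requests = defaultdict(list)
            for u in range(1, n):
                while lowest_missing[u] in have[u]:
                    lowest_missing[u] += 1
                if lowest_missing[u] <= k:
                    requests[pick_target(u)].append((u, lowest_missing[u]))
            grants = []
            for v, reqs in requests.items():
                servable = [r for r in reqs if r[1] in have[v]]
                if servable:
                    # Assumption: a contested target serves one request at random.
                    grants.append(rng.choice(servable))
            for u, piece in grants:
                deliver(u, piece)
        if incomplete == 0:
            return t
    raise RuntimeError("did not complete within max_slots")


print("completion slot:", simulate_interleave(n=200, k=20))
```

Pushes and pull grants are collected before being applied, so a piece received in a slot can only be uploaded from the next slot onward, as the model in Section 2 requires.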



Theorem 6 If $T_{k_1}$ is the time INTERLEAVE takes to disseminate the $k_1$ lowest numbered
pieces, then given any s < \ we have that

\[ P\left[ T_{k_1} < 9 k_1 + 2 (1 + \epsilon) \log_2 n \right] > 1 - 5 n^{-s} \]

for any $\epsilon > 0$, $k_1$ at most polynomial in n, and n large enough.
The above theorem implies that, with high probability, INTERLEAVE achieves complete
dissemination in time that is within a factor of nine of what an optimal fully centralized protocol 
could achieve. This means that it will be able to provide a significant file-splitting speedup for 
large networks. 

The fact that users communicate pieces in rough order, and the delay guarantee of Theorem 
6 for the lowest numbered pieces, suggests that users receive lower numbered pieces before higher 
numbered ones. This indicates that INTERLEAVE, or protocols of a similar design, would be 
useful in peer-to-peer networks in which the data to be disseminated is of a real-time nature. 

It is interesting to contrast the above performance guarantee of INTERLEAVE with the 
single-piece results of Karp et al. [9]. In that paper, the authors obtain a lower bound of
$\Omega(n \log\log n)$ on the number of transmissions of the single piece that need to occur for complete
dissemination, for any protocol relying on random user contacts of the kind studied in our paper.
However, when there are multiple pieces, protocols can save bandwidth by pipelining across
pieces. INTERLEAVE manages to do this pipelining in a way that results in $O(n)$ transmissions
per piece, which is the optimal order. 

We now move on to consider two-sided piece selection protocols. Users carry out pushes/pulls, 
but have knowledge of the target's current state. We consider an initial state where n distinct 
pieces are present in the system, one in each user. Completion from such an initial state has 
been previously studied, often under the alternate title of "all-to-all communication". For this
state, consider the following two-sided pull protocol: 



ADVOCATE: If the user does not already possess the target's initial piece, it 
downloads that piece. Else it pulls a random piece from among those present 
in the target but absent in the user. 
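
The ADVOCATE rule can be sketched as follows. This Python example is illustrative only: it assumes soft constraints (a target may serve any number of pulls in a slot), numbers users and their initial pieces 0 to n - 1, and uses the invented names `advocate_choice` and `simulate_advocate`.

```python
import random


def advocate_choice(user_have, target_have, target_initial, rng):
    """Piece pulled from the target under ADVOCATE, or None if nothing is new."""
    if target_initial not in user_have:
        return target_initial
    candidates = sorted(target_have - user_have)
    return rng.choice(candidates) if candidates else None


def simulate_advocate(n, seed=0, max_slots=100_000):
    """Slots until all n users hold all n pieces; user i starts with piece i."""
    rng = random.Random(seed)
    have = [{i} for i in range(n)]
    for t in range(1, max_slots + 1):
        grants = []  # decided from the start-of-slot state, applied afterwards
        for u in range(n):
            v = rng.randrange(n - 1)
            if v >= u:
                v += 1
            piece = advocate_choice(have[u], have[v], v, rng)
            if piece is not None:
                grants.append((u, piece))
        for u, piece in grants:
            have[u].add(piece)
        if all(len(h) == n for h in have):
            return t
    raise RuntimeError("did not complete within max_slots")


n = 60
print(f"n = {n}: completed in {simulate_advocate(n)} slots")
```

Since each user gains at most one piece per slot and starts with one, no run of this sketch can finish in fewer than n - 1 slots.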



In this protocol each user acts as an advocate for its initial piece. If each user is restricted 
to downloading at most one piece in every time slot, an optimal central protocol takes at least 
$n - 1$ time slots to complete. The following theorem shows that the ADVOCATE protocol
completes in time very close to this optimal, with high probability. 

Theorem 7 Starting with each user having one unique piece, the ADVOCATE protocol
operating under soft constraints finishes in $n + O(\log n)$ time with high probability.

In the above theorem the pre-constant 1 of n is the best possible. The above theorem 
means that for large n the fraction of wasted time slots is negligible. 



4 One-sided Protocols 

In this section we give the proofs of Theorems 1-6, which deal with one-sided protocols. 
4.1 One-sided Pull-based Protocols



Since Theorem 1 is a lower bound on completion time, there is no loss of generality in assuming 
soft constraints. The idea behind the proof is to use a probabilistic counting argument to 
provide a lower bound on the number of pull requests needed per piece. Since at most n pull 
requests occur in any given time slot, such a lower bound on the number of requests needed 
yields a lower bound on the dissemination time. 



Lemma 1 Consider a system with soft constraints, and an initial system state such that a
given piece p is present in only one user. Let A be the number of pull requests for p needed till
it is present in users. Then given any $\epsilon > 0$ and c > 0,

\[ P\left[ A < (1 - \epsilon) n \log n \right] < \frac{n^{-c}}{k} \]

for any k that grows at most polynomially in n, and n large enough.



Proof of Lemma 1: The probability of success of any pull request increases in the number of
pull requests occurring strictly before it. It is thus sufficient to assume that the pull requests
for piece p occur in strict sequence, with no two being simultaneous. For such a sequence,
$\mathrm{Geo}(i/n)$ pull requests for p are needed before its occupancy goes from i to i + 1 users (see footnote 2). Thus A
is stochastically greater than or equal to the sum B of independent $\mathrm{Geo}(i/n)$ variables, one for each
occupancy level i that the piece passes through, starting from i = 1. In turn, the
probability $P[B < (1 - \epsilon) n \log n]$ is shown in Lemma 10 in the appendix to be as small as is
required by this lemma. ■



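Proof of Theorem 1: For a given pull protocol, let $A_p$ be the number of pull requests for a
particular piece p until it is present in users. Since there are at most n pull requests in one
time slot, it follows that for any time t,

\[
P\left[ T_\beta < t \right] \le P\left[ A_p < \frac{nt}{\beta k} \text{ for some piece } p \right].
\]

By the union bound, the right-hand side is at most $\sum_p P\left[ A_p < \frac{nt}{\beta k} \right]$.
From Lemma 1 we see that, if n is large enough, choosing $\frac{nt}{\beta k} = (1 - \epsilon) n \log n$ yields
$\sum_p P\left[ A_p < \frac{nt}{\beta k} \right] < n^{-c}$. This proves the theorem.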



We now turn our attention to Theorem 2. Here, we assume without loss of generality 
that the system is operating under hard constraints. Any user needs at most k successful pull 
requests until its collection is complete. From the initial state of Theorem 2, the time to the 
next successful pull is always stochastically less than $\mathrm{Geo}(\eta/e)$ (see footnote 3). Each user can thus be shown to
complete in $O(k)$ time, and by a union bound the slowest of n users can be shown to finish in
$O(k + \log n)$ time. We now proceed to make the above argument formal.

Footnote 2: For any $0 < a < 1$, Geo(a) represents a geometrically distributed random variable:
$P[\mathrm{Geo}(a) > m] = (1 - a)^m$ for integer $m \ge 0$.

Footnote 3: The target has the requested piece with probability $\eta$, and is not simultaneously targeted by any other user
with probability $\left(1 - \frac{1}{n}\right)^{n-1} > \frac{1}{e}$.
Lemma 2 For each user i, let $T_i$ be the first time slot that user i has all k pieces, given that
each pull is successful with probability at least p. Then, for all times t,

\[
P\left[ \max_{1 \le i \le n} T_i > t \right] \le n e^{-\theta t} \left( \frac{p e^{\theta}}{1 - (1 - p) e^{\theta}} \right)^{k} \qquad (2)
\]

for $0 < \theta < \log \frac{1}{1 - p}$, and any n and k.

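Proof of Lemma 2: Any node would need at most k successful pull requests till its collection
is complete. For user i, let $X_1, \ldots, X_k$ be the differences between successive times at which its
pull requests are satisfied. Then, using the Markov inequality, for $\theta > 0$,

\[
P(T_i > t) = P(X_1 + \cdots + X_k > t) \le e^{-\theta t} E\left[ e^{\theta (X_1 + \cdots + X_k)} \right].
\]

Now, at each time, the probability that the user's pull is successful is lower bounded by p. This
means that any $X_j$ is stochastically upper bounded by a Geo(p) random variable, irrespective
of the other X's. Thus,

\[
E\left[ e^{\theta X_j} \mid X_1, \ldots, X_{j-1} \right] \le \frac{p e^{\theta}}{1 - (1 - p) e^{\theta}}
\]

for all $\theta$ such that $(1 - p) e^{\theta} < 1$. This gives

\begin{align*}
e^{-\theta t} E\left[ e^{\theta (X_1 + \cdots + X_k)} \right]
  &= e^{-\theta t} E\left[ e^{\theta (X_1 + \cdots + X_{k-1})} E\left[ e^{\theta X_k} \mid X_1, \ldots, X_{k-1} \right] \right] \\
  &\le e^{-\theta t} \left( \frac{p e^{\theta}}{1 - (1 - p) e^{\theta}} \right) E\left[ e^{\theta (X_1 + \cdots + X_{k-1})} \right] \\
  &\le e^{-\theta t} \left( \frac{p e^{\theta}}{1 - (1 - p) e^{\theta}} \right)^{k}.
\end{align*}

Now, by the union bound,

\[
P\left( \max_i T_i > t \right) \le \sum_{i=1}^{n} P(T_i > t) \le n e^{-\theta t} \left( \frac{p e^{\theta}}{1 - (1 - p) e^{\theta}} \right)^{k}.
\]

The lemma is thus proved. ■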

Proof of Theorem 2: Any pull request is successful with probability at least $p = \eta/e$. Setting
the RHS of (2) to $n^{-c}$ gives

\[
t = \frac{1}{\theta} \log\left( \frac{p e^{\theta}}{1 - (1 - p) e^{\theta}} \right) k + \frac{1 + c}{\theta} \log n \qquad (3)
\]

Note that the choice of $\theta$ in (3) trades off between the coefficients of k and $\log n$. Choosing
$\theta = \log(1 + p) < \log \frac{1}{1 - p}$ gives

\[
t = \left( 1 - \frac{\log p}{\log(1 + p)} \right) k + \frac{1 + c}{\log(1 + p)} \log n
\]

Substituting the value of $p = \eta/e$ gives (1) and proves the theorem.
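
The substitution $\theta = \log(1 + p)$ used in this proof can be spelled out step by step. The following is a worked check, assuming the tail bound of Lemma 2 has the form $n e^{-\theta t} \left( p e^{\theta} / (1 - (1 - p) e^{\theta}) \right)^{k}$ and that setting it equal to $n^{-c}$ and solving for t is the intended route.

```latex
\begin{align*}
e^{\theta} = 1 + p
  &\;\Longrightarrow\; 1 - (1 - p) e^{\theta} = 1 - (1 - p^{2}) = p^{2}, \\
\frac{p e^{\theta}}{1 - (1 - p) e^{\theta}}
  &= \frac{p (1 + p)}{p^{2}} = \frac{1 + p}{p}, \\
t &= \frac{k}{\theta} \log \frac{1 + p}{p} + \frac{1 + c}{\theta} \log n
   = \left( 1 - \frac{\log p}{\log(1 + p)} \right) k + \frac{1 + c}{\log(1 + p)} \log n .
\end{align*}
```

The choice is admissible because $(1 - p) e^{\theta} = 1 - p^{2} < 1$, which is the condition needed for the moment generating function of a Geo(p) variable to be finite.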



If users are implementing SEQUENTIAL PULL, all that is required for equation (2) to 
hold is that piece number p be present in $\eta n$ users by time slot number p, instead of being so
from the beginning. This is because users pull in sequence, and so do not pull piece p before
time p. Choosing $\theta$ close to 0, with success probability $\eta/e$, gives the following lemma, which is used in Section
4.3. 



Lemma 3 Consider a scenario such that piece i is present in $\eta n$ users by time slot i, for each
$i \in \{1, \ldots, k\}$, and SEQUENTIAL PULL is implemented. For $\epsilon > 0$ there is a constant $M_\epsilon$, such
that if T is the time till all users have all k pieces,

\[
P\left[ T > \left( \frac{e}{\eta} + \epsilon \right) k + (1 + c) M_\epsilon \log n \right] < n^{-c}
\]

for c > 0 and all n and k.



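Proof of Lemma 3: By the reasoning above, if t depends on $\theta$ as in (3), then $P[T > t] < n^{-c}$
for $p = \eta/e$ and any value of $\theta < \log \frac{1}{1 - p}$. Note now that

\[
\lim_{\theta \to 0} \frac{1}{\theta} \log\left( \frac{p e^{\theta}}{1 - (1 - p) e^{\theta}} \right) = \frac{1}{p}
\]

and so given $\epsilon > 0$ there exists a $\delta > 0$ such that setting $\theta = \delta$ gives
$t \le \left( \frac{e}{\eta} + \epsilon \right) k + (1 + c) M_\epsilon \log n$,
where $M_\epsilon = \frac{1}{\delta}$ is a constant that depends on $\epsilon$ and grows large as $\epsilon$ becomes small. ■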

For the proof of Theorem 3, we first prove a stochastic upper bound on the number of pull 
requests for a given piece before it is present in $\epsilon n$ users. We then use this to provide an upper
bound on the number of requests needed for all pieces to get to $\epsilon n$ users each. Since at least
$n(1 - \epsilon)$ pulls take place in every time slot until this state is reached, an upper bound on the
total number of pull requests needed provides an upper bound on the time taken for the system 
to reach such a state. For the remaining time to full completion, we use Theorem 2. 



Lemma 4 Let A be the number of pull requests for a piece p until it is present in all users. Then, for any $c > 0$ and any k and n,

$$P\left[A > 4e\, n\log k + 4e(1+c)\, n\log n\right] \le \frac{n^{-c}}{k}$$

Proof of Lemma 4: If a user not having p requests it from a user who has p, and no other request has the same target, then it is counted as a success. For any time t let $N_t$ be the number of users who have p at time t, and $a_t$ be the number of requests for p in that time slot. $N_{t+1}$ is equal to $N_t$ plus the number of successes in the $a_t$ requests. Note that $N_t \le N_{t+1} \le 2N_t$, because each of the $N_t$ users can satisfy at most one request in one time slot. For any $\theta > 0$, the function $f(x) = 1/x^{\theta}$ is convex for $x > 0$ and hence for any two positive numbers a and b, if $b \le a \le 2b$ then

$$\frac{1}{a^{\theta}} \le \frac{1}{b^{\theta}} - \frac{\theta}{(2b)^{\theta+1}}(a-b)$$

So, we have that

$$E\left[\frac{1}{N_{t+1}^{\theta}} \,\Big|\, N_t = j, a_t\right] \le \frac{1}{j^{\theta}} - \frac{\theta}{(2j)^{\theta+1}}\, E[N_{t+1} - j \mid N_t = j, a_t]$$

If $N_t = j$, the probability that any one of the $a_t$ requests for p is successful is $\frac{j}{n}\left(1-\frac{1}{n}\right)^{n-1} \ge \frac{j}{en}$. Thus the expected number of new users satisfies $E[N_{t+1} - j \mid N_t = j, a_t] \ge \frac{a_t j}{en}$ and hence

$$E\left[\frac{1}{N_{t+1}^{\theta}} \,\Big|\, N_t = j, a_t\right] \le \frac{1}{j^{\theta}} - \frac{\theta\, a_t\, j}{(2j)^{\theta+1} e n} = \frac{1}{j^{\theta}}\left(1 - \frac{\theta\, a_t}{2^{\theta+1} e n}\right) \le \frac{1}{j^{\theta}}\,\beta^{-a_t} \qquad (4)$$



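where $\beta = \exp\left(\frac{\theta}{e n 2^{\theta+1}}\right)$. Let $\mathcal{F}_t$ denote the entire history till (and including) time t: it contains all the numbers $a_1, \ldots, a_t$ as well as $N_1, \ldots, N_t$. Define the quantity

$$M_t = \frac{\beta^{\sum_{s=1}^{t-1} a_s}}{N_t^{\theta}}$$

By (4),

$$E[M_{t+1} \mid \mathcal{F}_t] = \beta^{\sum_{s=1}^{t} a_s}\, E\left[\frac{1}{N_{t+1}^{\theta}} \,\Big|\, \mathcal{F}_t\right] \le \frac{\beta^{\sum_{s=1}^{t-1} a_s}}{N_t^{\theta}} = M_t$$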

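so that $(M_t)$ is a nonnegative supermartingale with respect to $(\mathcal{F}_t)$. Let $T_n$ be the first time slot at which all n users have p, so that $N_{T_n} = n$ and $A = \sum_{s=1}^{T_n - 1} a_s$. Then the optional sampling theorem and the fact that $M_1 = N_1 = 1$ imply that

$$1 \ge E[M_{T_n}] = E\left[\frac{\beta^{\sum_{s=1}^{T_n-1} a_s}}{N_{T_n}^{\theta}}\right] \ge \frac{E[\beta^{A}]}{n^{\theta}}$$

For any number of requests F we now have the following series of inequalities:

$$P[A > F] \le P\left[\beta^{A} \ge \beta^{F}\right] \le \frac{E[\beta^{A}]}{\beta^{F}} \le \frac{n^{\theta}}{\beta^{F}} \qquad (5)$$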



Setting the RHS of (5) to $\frac{n^{-c}}{k}$ and substituting the value of $\beta$ we get

$$F = \frac{2^{\theta+1} e}{\theta}\, n\log k + \frac{2^{\theta+1}(\theta + c)\, e}{\theta}\, n\log n$$

The choice of $\theta$ enables us to trade off between the constants of $n\log k$ and $n\log n$. Setting $\theta = 1$ proves the statement of the Lemma. ■

The above lemma gives an upper bound on the number of pull requests required for any given piece before it achieves full occupancy. This can be used to provide an upper bound for the amount of time it takes until each piece has occupancy $\epsilon n$. This is done in the following lemma.

Lemma 5 Consider a system with n users, starting with k pieces present in at least one user each, implementing any pull-based protocol. Given $\epsilon > 0$, let $T_\epsilon$ be the first time that each piece is in at least $\epsilon n$ users. Then, for any $c > 0$ and any n and k,

$$P\left[T_\epsilon > \frac{4e}{1-\epsilon}\, k\log k + \frac{4e(1+c)}{1-\epsilon}\, k\log n\right] \le n^{-c}$$

Proof of Lemma 5: Until time $T_\epsilon$ the number of users who have completed their piece collections is less than $\epsilon n$ and hence there are at least $n(1-\epsilon)$ pull requests in each slot. Let $A_\epsilon$ be the total number of attempts until each piece is in $\epsilon n$ users. Then, for a given time t, the event $T_\epsilon > t$ implies the event $A_\epsilon > n(1-\epsilon)t$.

Let A be the number of pull attempts till all users have all pieces, and $A_p$ the number of pull requests for piece p till it is present in all users. We now have that

$$P[T_\epsilon > t] \le P[A_\epsilon > n(1-\epsilon)t] \le P[A > n(1-\epsilon)t] \le \sum_{p} P\left[A_p > \frac{n(1-\epsilon)t}{k}\right]$$

where the last inequality is a union bound over all pieces. If we choose

$$t = \frac{4e}{1-\epsilon}\, k\log k + \frac{4e(1+c)}{1-\epsilon}\, k\log n$$

then Lemma 4 implies that $P\left[A_p > \frac{n(1-\epsilon)t}{k}\right] \le \frac{n^{-c}}{k}$ and hence $P[T_\epsilon > t] \le n^{-c}$, completing the proof.



We have already seen that pull-based protocols take at most $O(k + \log n)$ time to get from a state where each piece is in $\epsilon n$ users to one in which all users have all pieces. This, in conjunction with Lemma 5, enables us to prove Theorem 3.

Proof of Theorem 3: Let $\epsilon = \frac{\delta}{2+\delta}$ and $T_\epsilon$ be the first time that each piece is present in at least $\epsilon n$ users. By Lemma 5,

$$P\left[T_\epsilon > 4e\left(1 + \frac{\delta}{2}\right)(k\log k + (1+c)\,k\log n)\right] \le n^{-c}$$

Also, for large enough n, Theorem 2 yields that

$$P\left[T - T_\epsilon > 2e\delta\,(k\log k + (1+c)\,k\log n)\right] \le n^{-c}$$

Putting the two equations above together proves the theorem.



4.2 One-sided Push-Based Protocols 

Proof of Theorem 4: The proof of Theorem 4 about the inefficiency of any one-sided push 
protocol in the final stages is similar to the proof of Theorem 1. Lemma 6 below is analogous 
to Lemma 1 for pull, and is proved in a similar fashion. It leads to the proof of Theorem 4 in 
the same way that Lemma 1 leads to the proof of Theorem 1. ■ 
Lemma 6 Consider an initial system state such that a given piece p is absent in users. Let A be the number of pushes for p needed till it is present in all users. Then given any $\epsilon > 0$ and $c > 0$, for any push-based protocol,

$$P[A < (1-\epsilon)\, n\log n] \le \frac{n^{-c}}{k}$$

for k at most polynomial in n and n large enough. 

Proof of Lemma 6: Since we are interested in the number of push transmissions for a given piece p, we can assume that they happen in strict sequence. Some of the transmissions occur to users already possessing p, and thus are not successful. For each i from 1 up to the initial number of users lacking p, let $X_i$ be the number of transmissions occurring when exactly i users do not have the piece. It is easy to see that $X_i \sim \mathrm{GEOM}(\frac{i}{n})$, and we are interested in obtaining a lower bound for $A = \sum_i X_i$. However, this is exactly what is done in Lemma 1, and we refer the reader to the proof of that lemma. ■
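The scale of A can be seen from its mean alone. The sketch below is an illustration only, assuming $X_i$ is geometric with success probability $i/n$ (so that $E[X_i] = n/i$); it sums these means for a hypothetical number `missing` of users that initially lack the piece. The function name is an assumption made for this example.

```python
import math

def expected_pushes(n, missing):
    """Mean of A = X_1 + ... + X_missing, where X_i ~ Geometric(i/n) has mean n/i."""
    return sum(n / i for i in range(1, missing + 1))

def ratio_to_n_log_n(n, missing):
    """Expected number of pushes relative to n*log(n)."""
    return expected_pushes(n, missing) / (n * math.log(n))
```

When almost all users lack the piece, the mean is n times a harmonic number and hence close to $n \log n$, which matches the order of the lower bound in the lemma.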

The PRIORITY PUSH Protocol 

In the remainder of this section we describe the classical random gossip process defined in 
[10], give a new concentration result for this process, and use the result to prove Theorem 5, 
regarding the PRIORITY PUSH protocol. The random gossip process concerns n users and 
only one message, which is initially present with one user. Users with messages are called 
informed. In every time slot, every informed user contacts another user chosen uniformly at 
random from the set of all users and sends (pushes) the message to this user, who is then also 
informed. Let $Y_t$ be the number of informed users at time t, and call the process $(Y_t : t \ge 0)$ the classical gossip process. The initial condition is $Y_0 = 1$.

We present a new concentration result for the process Y. Related results are given in [10, 11], but the result here is more exacting regarding the time that the message reaches the users. Let $G(y) = y + (n-y)\left(1-\left(1-\frac{1}{n}\right)^{y}\right)$ for $y \in [1, n]$. For brevity, the dependence of G on n is suppressed. Note that for $y \in \{1, \ldots, n\}$,

$$G(y) = E[Y_{t+1} \mid Y_t = y] \qquad (6)$$

Define the deterministic sequence $(\hat{Y}_t : t \in \mathbb{Z})$ recursively by $\hat{Y}_t = 0$ for $t < 0$, $\hat{Y}_0 = 1$, and $\hat{Y}_{t+1} = G(\hat{Y}_t)$ for $t \ge 0$. The following proposition is proved in the appendix.

Proposition 1 (Deterministic nature of the classical gossip) Let $0 < c < 1$, $l' \in \mathbb{Z}_+$, and $\epsilon > 0$. Then for n sufficiently large:

(i) $\hat{Y}_{(1+\epsilon)\log_2 n - l'} \ge \left(1 - \frac{\epsilon}{2}\right)n$.

(ii) If, also, $\epsilon < \frac{1-c}{14}$, then

$$P\{|Y_t - \hat{Y}_t| \le 2^{t} n^{-2\epsilon} \text{ for } 0 \le t \le (1+\epsilon)\log_2 n\} \ge 1 - n^{-c},$$

(iii) $P\{Y_{(1+\epsilon)\log_2 n} \ge (1-\epsilon)n\} \ge 1 - n^{-c}$.

Thus, with high probability, the classical gossip process closely follows the deterministic sequence $\hat{Y}_t$, and reaches $n(1-\epsilon)$ users in $(1+\epsilon)\log_2 n$ time.
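The deterministic sequence is easy to iterate directly. The sketch below is illustrative only: it assumes $G(y) = y + (n-y)(1-(1-1/n)^{y})$ and the recursion starting from 1 informed user, and reports how many rounds the deterministic sequence needs to pass a given fraction of n. Function names are hypothetical. Since the proposition is asymptotic in n, the round counts for moderate n need not yet fall below $(1+\epsilon)\log_2 n$ for small $\epsilon$.

```python
import math

def G(y, n):
    """Expected number of informed users after one round, given y informed users."""
    return y + (n - y) * (1.0 - (1.0 - 1.0 / n) ** y)

def deterministic_gossip(n, rounds):
    """Return the deterministic sequence [Yhat_0, ..., Yhat_rounds] with Yhat_0 = 1."""
    ys = [1.0]
    for _ in range(rounds):
        ys.append(G(ys[-1], n))
    return ys

def rounds_to_reach(n, fraction):
    """Smallest t with Yhat_t >= fraction * n, for 0 < fraction < 1."""
    y, t = 1.0, 0
    while y < fraction * n:
        y = G(y, n)
        t += 1
    return t
```

Comparing `rounds_to_reach(n, 0.9)` with `math.log2(n)` for increasing n shows the doubling phase followed by a short saturation phase.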

Proof of Theorem 5: Consider a particular piece p during execution of the PRIORITY PUSH protocol, and let the time axis be adjusted so that the source first transmits p in time slot 1. Note that the spread of piece p and subsequent pieces is not affected by the spread of pieces preceding p. For any time t, let $A_t$ be the number of users transmitting p in time slot t. We are interested in the process A, and for this purpose all higher numbered pieces are equivalent, because they cause similar interference to the spread of p. So, let $B_t$ be the number of users transmitting higher numbered pieces in time slot t. At any time, a user may be counted either in A, in B, or as not transmitting pieces that are numbered p or higher. It is clear that:

• The process $(A_t + B_t : t \ge 1)$ is stochastically identical to Y delayed by one time unit: $(A_t + B_t : t \ge 1) \stackrel{d}{=} (Y_{t-1} : t \ge 1)$ [4], and in particular $A_t + B_t \stackrel{d}{=} Y_{t-1}$ for each $t \ge 1$.

• The process B is stochastically identical to Y delayed by $l + 1$ time units: Adopting the convention that $Y_t = 0$ for $t < 0$, we have $(B_t : t \ge 1) \stackrel{d}{=} (Y_{t-l-1} : t \ge 1)$.

Since A t = (A t + B t ) — B t , the above shows that the process A is the difference of two time 
shifted versions of the classical gossip process. Based on this idea, we shall apply Proposition 1 
to deduce bounds on A t . Each time p is pushed, we call the node that p is pushed to, the target. 
Thus, during the execution of the algorithm, a sequence of targets is generated, consisting of 
random variables that are independent, with each variable uniformly distributed over all the 
nodes. Even though p may be pushed only a finite number of times, we can extend the target 
sequence if necessary, so that it is an infinite sequence of independent random variables, each 
uniformly distributed over the set of all n nodes. 

For some $\epsilon > 0$ (to be chosen later), let $T = (1+\epsilon)\log_2 n$, and define the following events:

$$\mathcal{E}_1 = \{\, |A_t + B_t - \hat{Y}_{t-1}| \le 2^{t-1} n^{-2\epsilon} \text{ for } 1 \le t \le T-1 \,\},$$
$$\mathcal{E}_2 = \{\, |B_t - \hat{Y}_{t-l-1}| \le 2^{t-l-1} n^{-2\epsilon} \text{ for } 1 \le t \le T-1 \,\},$$

and let $\mathcal{E}_3$ denote the event that the first $nl(1-\epsilon)$ targets of the target sequence include at least $(1 - e^{-l} - \delta)n$ distinct nodes.

By Proposition 1.(ii), $P[\mathcal{E}_1 \cap \mathcal{E}_2] \ge 1 - 2n^{-c}$ for n large enough. The probability of $\mathcal{E}_3$ is the same as the probability that $nl(1-\epsilon)$ balls thrown independently and uniformly into n bins cover at least $(1 - e^{-l} - \delta)n$ bins. A standard Poisson comparison argument shows that if $\epsilon$ is sufficiently small (depending on $\delta$), then $P[\mathcal{E}_3] \ge 1 - n^{-c}$ for n large enough. Thus, for such $\epsilon$, $P[\mathcal{E}_1 \cap \mathcal{E}_2 \cap \mathcal{E}_3] \ge 1 - 3n^{-c}$ for n large enough. It thus remains to show that, on the event $\mathcal{E}_1 \cap \mathcal{E}_2 \cap \mathcal{E}_3$, message p reaches at least $(1 - e^{-l} - \delta)n$ nodes.
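The balls-and-bins coverage behind the third event can be checked numerically. The sketch below is a rough illustration, not part of the proof: it computes the expected covered fraction when a given number of balls are thrown uniformly into n bins, and estimates the same fraction by a seeded simulation. Function names and the seed parameter are assumptions of this example.

```python
import math
import random

def expected_coverage(n, throws):
    """Expected fraction of n bins hit by `throws` independent uniform throws."""
    return 1.0 - (1.0 - 1.0 / n) ** throws

def simulated_coverage(n, throws, seed=0):
    """Fraction of bins hit in one simulated experiment."""
    rng = random.Random(seed)
    hit = {rng.randrange(n) for _ in range(throws)}
    return len(hit) / n
```

With `throws = n * l` the expected fraction is close to $1 - e^{-l}$, for example about 0.632 for $l = 1$ and about 0.865 for $l = 2$.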

On the event $\mathcal{E}_1 \cap \mathcal{E}_2$, for $1 \le t \le T - 1$,

$$|A_t - (\hat{Y}_{t-1} - \hat{Y}_{t-l-1})| \le |A_t + B_t - \hat{Y}_{t-1}| + |B_t - \hat{Y}_{t-l-1}| \le 2^{t} n^{-2\epsilon}$$

which means that

$$A_t \ge \hat{Y}_{t-1} - \hat{Y}_{t-l-1} - 2^{t} n^{-2\epsilon}. \qquad (7)$$

[4] The symbol $\stackrel{d}{=}$ denotes equality in distribution.

Summing each side of (7) over $1 \le t \le T - 1$, canceling like terms, and applying Proposition 1.(i) with $l' = l + 1$ yields

$$\sum_{t=1}^{T-1} A_t \ge \left(\sum_{t=T-l}^{T-1} \hat{Y}_{t-1}\right) - 2^{T} n^{-2\epsilon} \ge nl\left(1 - \frac{\epsilon}{2}\right) - n^{1-\epsilon} \ge nl(1-\epsilon)$$

for sufficiently large n. Now, $\sum_{t=1}^{T-1} A_t$ is the total number of times p is pushed before time T. Thus, on the event $\mathcal{E}_1 \cap \mathcal{E}_2$, there are at least $nl(1-\epsilon)$ pushes before time T, and on the event $\mathcal{E}_3$, that is enough pushes to reach at least $(1 - e^{-l} - \delta)n$ nodes. Thus, p does reach at least $(1 - e^{-l} - \delta)n$ nodes by time T, on the event $\mathcal{E}_1 \cap \mathcal{E}_2 \cap \mathcal{E}_3$. The proof of Theorem 5 is complete.



4.3 INTERLEAVE 

In this section we analyze the performance of INTERLEAVE, and prove Theorem 6. The 
following two observations about INTERLEAVE facilitate its analysis: 

• The pulling does not interfere with the pushing: if the system were sampled only in the 
odd time slots, the pieces pushed would be identical to an alternate system running only 
PRIORITY PUSH. In particular, the pushing of higher numbered pieces is unaffected by 
the spread of lower numbered pieces. 

• Within the pulling in the even time slots, the spread of higher numbered pieces does not 
interfere with the pulling of lower numbered pieces. 

Call a piece failed if it does not reach ^ users within time 2(1 + e) log 2 n of being pushed 
by the source. Note that | < 1 — e _1 and hence, by Theorem 5 with / = 1, the PRIORITY 
PUSH operating in the odd time slots ensures that P[p fails] < n~ c for < c < 1, and n large 
enough. 

For each $i \in \{1, \ldots, k\}$ define $T^{i} = \min\{t : \text{each piece in } 1, \ldots, i \text{ is present in all users}\}$. We are interested in finding an upper bound on any given $T^{i}$. So for the remaining analysis in this section we choose some $k_1 \le k$ and provide an upper bound on $T^{k_1}$.

Lemma 7 Given $1 \le q_1 < \ldots < q_m$, let $\mathcal{E}$ be the event that $\{q_1, \ldots, q_m\}$ is the set of all failed pieces numbered less than or equal to $k_1$. Then, for any $c > 0$ and n large enough,

$$P\left[\, T^{k_1} > 8.8k_1 + 2(1+\epsilon)\log_2 n + m\gamma(1+c)(\log n + \log m) \;\Big|\; \mathcal{E} \,\right] \le 2n^{-c}$$

such that $\gamma$ is a constant independent of n.
Proof of Lemma 7: For each $i \in \{1, \ldots, k\}$ let $T^{i}$ be as above and let

$$\hat{T}^{i} = \max\{T^{i},\; 2i + 2(1+\epsilon)\log_2 n\}$$

Consider any two successive failed pieces $q_j$ and $q_{j+1}$, and let $q_{j+1} - q_j = l$. Then, for each $1 \le s \le l - 1$, piece number $q_j + s$, which has not failed, is present in at least $n(1 - e^{-1} - \epsilon)$ users by time $\hat{T}^{q_j} + 2s$. Also, after $\hat{T}^{q_j}$, all users pull pieces numbered $q_j + 1$ or higher. For this scenario, Lemma 3 implies that all users obtain all pieces i such that $q_j < i < q_{j+1}$ within

$$\left(\frac{1}{p} + \epsilon\right)(q_{j+1} - q_j - 1) + M_\epsilon(1+c)\log n$$

pull slots, with probability at least $1 - n^{-c}$, where $p = \frac{1 - e^{-1} - \epsilon}{e}$. Choosing $\epsilon$ so small that $\frac{1}{p} + \epsilon < 4.4$, this implies that

$$\hat{T}^{q_{j+1}-1} \le \hat{T}^{q_j} + 8.8(q_{j+1} - q_j) + 2M_\epsilon(1+c)\log n$$

with probability at least $1 - n^{-c}$. Also, by time $\hat{T}^{q_{j+1}-1}$ all users pull for pieces $q_{j+1}$, or higher pieces if they already have $q_{j+1}$. By Theorem 3 with $k = 1$,

$$\hat{T}^{q_{j+1}} - \hat{T}^{q_{j+1}-1} \le 8e(1+\delta)(1+c)\log n$$

with probability at least $1 - n^{-c}$. The last two inequalities above imply that

$$\hat{T}^{q_{j+1}} \le \hat{T}^{q_j} + 8.8(q_{j+1} - q_j) + \gamma(1+c)\log n \qquad (8)$$

with probability at least $1 - 2n^{-c}$, such that $\gamma = 8e(1+\delta) + 2M_\epsilon$. Defining $\hat{T}^{q_0} = 2(1+\epsilon)\log_2 n$ and $q_{m+1} = k_1$, and summing (8) over all j shows that

$$\hat{T}^{k_1} \le 8.8k_1 + 2(1+\epsilon)\log_2 n + m\gamma(1+c)\log n$$

with probability at least $1 - 2mn^{-c}$. Replacing c by $c + \frac{\log m}{\log n}$ proves the lemma. ■

We are now ready to prove the performance guarantee of the INTERLEAVE protocol as 
given in Theorem 6. 

Proof of Theorem 6: Let m be the number of failed pieces in $\{1, \ldots, k_1\}$. Then, by the Markov inequality we have that

$$P[m > k_1 n^{-s}] \le \frac{E[m]}{k_1 n^{-s}}$$

By Theorem 5, if n is large enough the probability piece p fails is less than $3n^{-2s}$ for $s < \frac{1}{2}$. This means that $E[m] \le 3k_1 n^{-2s}$ and so

$$P[m > k_1 n^{-s}] \le 3n^{-s} \qquad (9)$$

Now, in Lemma 7, if $m \le k_1 n^{-s}$ then the fact that $k_1$ is at most polynomial in n implies that $m\gamma(1+c)(\log n + \log m)$ is $o(k_1)$, so for n large enough its value is less than $(0.2)k_1$. So Lemma 7 yields:

$$P[\,T^{k_1} > 9k_1 + 2(1+\epsilon)\log_2 n \mid m \le k_1 n^{-s}\,] \le 2n^{-s} \qquad (10)$$

Theorem 6 follows from (9) and (10). ■
5 Two-sided Protocols



Consider the initial system condition where there are n users in the system, each possessing 
a unique piece. The initial piece with user i is denoted by $p_i$. In every time slot each user contacts a random other user to request a piece. The piece selection is two-sided, and is made according to the ADVOCATE protocol. To prove Theorem 7, we analyze the evolution of the system by breaking time up into phases.

At any time t define a user i's primary collection to be the set of pieces $p_j$ such that i contacted user j by time t. Note that a piece $p_j$ can be in the primary collection of user i even if i did not get $p_j$ directly from user j. Note that at any time, the primary collections of the users are independent of each other. The pieces that user i has that are not in its primary collection are said to be in user i's secondary collection.



Phase I 

This phase goes from the beginning up until time $n^{1/4-\delta}$, for some fixed $0 < \delta < \frac{1}{4}$. During this phase, with high probability, no user contacts the same user twice. This guarantees that all users are successful in each slot in this period.

Lemma 8 For $\delta < \frac{1}{4}$, at time $n^{1/4-\delta}$ all users have at least $n^{1/4-\delta} - 1$ pieces each, with probability at least $1 - n^{-4\delta}$.



Proof of Lemma 8: A user i is said to repeat in time slot s if it contacts a user in slot s that it had contacted previously. Given a time $t = n^{1/4-\delta}$, the probability of a given user repeating in any given time slot in $1, \ldots, t$ is less than $\frac{t}{n}$. Thus the probability of a given user repeating twice or more by time slot t is less than $\binom{t}{2}\left(\frac{t}{n}\right)^{2}$, which is less than $\frac{t^4}{n^2}$. Taking a union bound over the set of all users, we get that

$$P[\text{any user repeats at least twice by time } t] \le n\,\frac{t^4}{n^2} = n^{-4\delta}$$

Finally, if every user repeats at most once, then every user misses at most one piece by time $n^{1/4-\delta}$. ■
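The rarity of double repeats in Phase I can be illustrated by direct simulation. This is a minimal sketch under the assumption that each user contacts a uniformly random other user in every slot, independently across slots and users; the function names and the seed are hypothetical choices for this example.

```python
import random

def users_with_double_repeat(n, t, seed=0):
    """Count users that contact an already-contacted user at least twice in t slots."""
    rng = random.Random(seed)
    count = 0
    for user in range(n):
        seen, repeats = set(), 0
        for _ in range(t):
            other = rng.randrange(n - 1)
            if other >= user:
                other += 1  # shift so the choice is uniform over the other n-1 users
            if other in seen:
                repeats += 1
            seen.add(other)
        if repeats >= 2:
            count += 1
    return count

def union_bound(n, t):
    """The bound n * t^4 / n^2 on the probability that some user repeats twice."""
    return n * (t ** 4) / (n ** 2)
```

For t well below $n^{1/4}$ the simulated count is typically zero, in line with the union bound being small.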



Phase II 

This phase continues up until time $\frac{n}{2}$. Beyond time $n^{1/4}$, the users start repeating contacts more often, and hence the technique of Phase I is not applicable. For Phase II, we make use of the fact that the sizes of the users' primary collections are large enough to ensure that useful pieces can still be found in these primary collections.

For two users A and B, let $P_A$ and $P_B$ denote their respective primary collections, and $S_A$ and $S_B$ their secondary collections, at some time $t \ge n^{1/4-\delta}$.



Lemma 9 For any e > if Pb — Pa denotes the set of pieces in Pb but not in Pa then 



\Pr-Pa\ < n(l - e n) e Ul — e) 



\P A \ < n(l - e~)(l - e ) 



Proof of Lemma 9: Each of the users A and B makes t contacts in time t. Consider now an alternate system where A and B make a random number of contacts $N_A$ and $N_B$ in time t, which are independent and distributed according to Poisson(t). Denote the primary and secondary collections of the users in the alternate system with a hat. Observe that $\hat{P}_A$ and $\hat{P}_B$ are stochastically increasing in $N_A$ and $N_B$ respectively, and so we have that for any $x > 0$,

$$P[|P_B - P_A| < x] \le P\left[\,|\hat{P}_B - \hat{P}_A| < x \;\Big|\; N_B \le t \text{ and } N_A \ge t\,\right]$$
$$= \frac{P[\,|\hat{P}_B - \hat{P}_A| < x \text{ and } N_B \le t \text{ and } N_A \ge t\,]}{P[N_B \le t \text{ and } N_A \ge t]}$$
$$\le \frac{P[\,|\hat{P}_B - \hat{P}_A| < x\,]}{P[N_B \le t \text{ and } N_A \ge t]}$$
$$\le 4\,P\left[\mathrm{Bin}\left(n,\; e^{-t/n}(1 - e^{-t/n})\right) < x\right]$$

where for the last inequality we have used the fact that for the Poisson-distributed random variables $N_A$ and $N_B$, $P[N_A \ge t] \ge \frac{1}{2}$ and $P[N_B \le t] \ge \frac{1}{2}$.

Setting $x = n(1 - e^{-t/n})e^{-t/n}(1-\epsilon)$, the Chernoff Bound on the binomial distribution (see for example Theorem 4.5 of [18]) gives



P 



\Pb ~ Pa\ < n(l - e~) e -n(l - e) 



Now, for n^~ s <t<%, 



. t_ . t_ , 

n(l — e n )e » > t(l — e )e 4 > 



i i 

«4 ° 



This proves the first part of the lemma. The second part is proved along similar lines:

$$P[|P_A| < x] \le P\left[\,|\hat{P}_A| < x \;\Big|\; N_A \le t\,\right] \le 2\,P\left[\,|\hat{P}_A| < x\,\right] = 2\,P\left[\mathrm{Bin}\left(n,\; 1 - e^{-t/n}\right) < x\right]$$

Setting $x = n(1 - e^{-t/n})(1-\epsilon)$ and using the Chernoff bound as above proves the second part of the Lemma. ■
Now, $|P_A| + |S_A| \le t$, and hence the second part of Lemma 9 implies that with high probability $|S_A| \le t - n(1 - e^{-t/n})(1-\epsilon)$.

Now, suppose at time t user A contacts user B. If $|S_A| < |P_B - P_A|$ then A is able to receive a new piece from B (although this condition is not necessary). From the above arguments, it follows that if $t - n(1 - e^{-t/n})(1-\epsilon) < n(1 - e^{-t/n})e^{-t/n}(1-\epsilon)$, or equivalently if $\frac{t}{n} < (1-\epsilon)(1 - e^{-2t/n})$, then A can receive a piece from B with failure probability less than
g e -e /9 p or sma ii enou gh e, it can be shown that all users are successful with probability 
n 2 e~ omn ^ for all times up until time |. 

Thus, in conjunction with the results of the first phase, it can be shown that all users have at least $\frac{n}{2} - 1$ pieces after $\frac{n}{2}$ time slots.



Phase III 

This phase starts from time $\frac{n}{2}$ and ends at time n. In this phase the primary collections may not be large enough to guarantee the existence of a useful piece among every pair of users. So, for the third phase, we need to leverage the spread of pieces across users.

Let A be a set of pieces of size $L = 8\log n$, and let $\epsilon = \frac{3\log n}{n}$. A is said to be a "bad" set if the number of users having no piece in A at time $\frac{n}{2}$ is greater than $n\epsilon$.

For any given user, the probability that it has no piece in A is less than or equal to the probability that user does not have a piece of A in its primary collection. This probability is further less than or equal to $\left(1 - \frac{L}{n}\right)^{n/2}$, because the user has made $\frac{n}{2}$ contacts. Further, $\left(1 - \frac{L}{n}\right)^{n/2} \le e^{-L/2}$. Thus, the number of users who do not have any pieces in A at time $\frac{n}{2}$ is stochastically smaller than a Binomial$(n, e^{-L/2})$ random variable. Thus,

$$P[\text{given set } A \text{ is bad}] \le P\left[\mathrm{Binomial}(n, e^{-L/2}) > n\epsilon\right]$$

Since there are $\binom{n}{L}$ possible sets of size L, a union bound over all such sets of size L gives

$$P[\text{there exists a bad set of size } L] \le \binom{n}{L} P\left[\mathrm{Binomial}(n, e^{-L/2}) > n\epsilon\right]$$
$$\le \binom{n}{L}\binom{n}{n\epsilon}\, e^{-n\epsilon L/2}$$
$$\le n^{L}\left(\frac{e}{\epsilon}\right)^{n\epsilon} e^{-n\epsilon L/2}$$
$$= \exp\left(L\log n + n\epsilon + n\epsilon\log\frac{1}{\epsilon} - \frac{n L\epsilon}{2}\right)$$

The second inequality above was obtained from the relation $P[\mathrm{Binomial}(n,p) \ge k] \le \binom{n}{k}p^{k}$.

Now, substituting the value of L and $\epsilon$ makes the exponent in the last inequality equal to

$$8(\log n)^2 + 3\log n + 3(\log n)^2 - 3(\log n)(\log(3\log n)) - 12(\log n)^2$$

which is less than $-(\log n)^2$ for $n \ge 3$. Thus, we have

$$P[\text{there exists a bad set of size } 8\log n] \le e^{-(\log n)^2}$$

This means that any user missing at least $L = 8\log n$ pieces fails to obtain a useful piece with probability at most $\epsilon = \frac{3\log n}{n}$. So, at time n, the probability that a given user has $10\log n$ pieces missing is less than $P\left[\mathrm{Binomial}\left(\frac{n}{2}, \frac{3\log n}{n}\right) \ge 10\log n\right]$. Using a Chernoff bound on the binomial distribution (in particular, part 3 of Theorem 4.4 in [18]) it can be shown that the probability that a given user has $10\log n$ pieces missing is less than $2^{-10\log n}$, which is less than $n^{-6}$. Taking a union bound over the set of users, we can show that all users have at least $n - 10\log n$ pieces by time n with high probability.
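The arithmetic behind the bad-set exponent can be verified directly. The sketch below is illustrative and assumes the values $L = 8\log n$ and $\epsilon = 3\log n / n$ with natural logarithms; it evaluates the exponent $L\log n + n\epsilon + n\epsilon\log(1/\epsilon) - nL\epsilon/2$ and its expanded form. Function names are hypothetical.

```python
import math

def bad_set_exponent(n):
    """Exponent L*log n + n*eps + n*eps*log(1/eps) - n*L*eps/2 with L = 8 log n, eps = 3 log n / n."""
    log_n = math.log(n)
    L = 8.0 * log_n
    eps = 3.0 * log_n / n
    return L * log_n + n * eps + n * eps * math.log(1.0 / eps) - n * L * eps / 2.0

def expanded_exponent(n):
    """The same exponent after substituting L and eps term by term."""
    x = math.log(n)
    return 8 * x**2 + 3 * x + 3 * x**2 - 3 * x * math.log(3 * x) - 12 * x**2
```

The exponent simplifies to $-(\log n)^2 + 3\log n\,(1 - \log(3\log n))$, which falls below $-(\log n)^2$ as soon as $3\log n > e$.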



Phase IV 

This phase begins at time n and finishes when every user has received every piece. In this phase each of the users has at most $10\log n$ pieces left to finish. Each of these pieces is present in at least $n(1 - e^{-1} - \epsilon)$ users of the network. We can now apply Theorem 2 with $k = 10\log n$ and $\eta = 1 - e^{-1} - \epsilon$ to conclude that all users finish in $n + O(\log n)$ time with high probability.



6 Simulations 



In this section we investigate the performance of the PRIORITY PUSH and INTERLEAVE 
protocols using simulations. In the system model analyzed in this paper each user has the 
ability to communicate with another user chosen at random from the entire network. A more 
realistic assumption might be to let each user have only a limited view of the network that does 
not change over time. Specifically, we assume that each user has a fixed list of a small number 
of other users, which we shall refer to as its contact list. A user only pushes to and pulls from 
other users in its contact list. Each contact list is generated randomly, independent of other 
contact lists. It remains constant for all time. 
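One way such contact lists could be generated is sketched below. The text specifies only that each list is random, fixed, and independent of the others; the choices made here (m distinct contacts per user, sampled uniformly without replacement, never including the user itself, lists not required to be symmetric) are assumptions of this example, as are the function names.

```python
import random

def make_contact_lists(n, m, seed=0):
    """Return a list of n contact lists, each holding m distinct users other than the owner."""
    rng = random.Random(seed)
    lists = []
    for user in range(n):
        picks = rng.sample(range(n - 1), m)
        # Shift indices at or above `user` so the owner is never in its own list.
        lists.append([v + 1 if v >= user else v for v in picks])
    return lists

def pick_neighbor(contact_lists, user, rng):
    """Choose the user to contact in one time slot, uniformly from the fixed list."""
    return rng.choice(contact_lists[user])
```

A simulation would call `make_contact_lists` once and then `pick_neighbor` in every slot, so that the lists stay constant for all time.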

Consider now the case that users implement INTERLEAVE, but in each time slot a user 
communicates with a neighbor chosen at random from within its contact list. The source 
however still pushes pieces in every other time slot to another user chosen uniformly at random 
from the set of all users. Figure 1 displays the observed time taken for k = 1000 pieces to be disseminated to n = 500 users, versus the size of the contact lists. From this we can see that if each user has a contact list of size 8 or more, the completion time using INTERLEAVE is close to $2(k + \log_2 n) \approx 2020$, which is much better than the $10k + 2\log_2 n$ predicted by Theorem 6. This difference suggests that the proof technique for Theorem 6 is conservative.

Figure 1: The completion times of INTERLEAVE for n = 500 users and k = 1000 pieces, versus the size of the user contact lists.

Besides the overall completion time, we are also interested in the time a typical piece takes to get from the source to a typical user. Specifically, if a piece j emerges from the source at time t and reaches a user i at time t + d, we say that delay(i, j) = d. The average delay profile D(d) is the average of $1_{\mathrm{delay}(i,j) \le d}$ over all possible choices of user i and piece j:

$$D(d) = \frac{1}{nk}\sum_{i=1}^{n}\sum_{j=1}^{k} 1_{\mathrm{delay}(i,j) \le d}$$

where $1_{\mathrm{delay}(i,j) \le d}$ is 1 if and only if delay(i, j) $\le$ d and 0 otherwise.
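The average delay profile is straightforward to compute from simulation output. The sketch below assumes the delays are stored in a hypothetical n-by-k nested list `delay`, where `delay[i][j]` is the delay of piece j at user i and `None` marks a piece that never reached that user; this representation is an assumption of the example, not something the text prescribes.

```python
def average_delay_profile(delay, d):
    """Fraction of (user, piece) pairs whose delay is at most d.

    delay[i][j] is the delay of piece j at user i, or None if never received.
    """
    n = len(delay)
    k = len(delay[0])
    hits = sum(1 for row in delay for x in row if x is not None and x <= d)
    return hits / (n * k)

def delay_profile_curve(delay, max_d):
    """Values D(0), D(1), ..., D(max_d) as a list."""
    return [average_delay_profile(delay, d) for d in range(max_d + 1)]
```

With this convention the limiting value of the curve is the fraction of (user, piece) pairs that are eventually served.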

Different piece selection protocols have different average delay profiles. Also, for a given dissemination protocol, the average delay profile can vary with the size of each user's contact list. A delay profile rising further to the left implies, on average, faster dissemination of pieces. Figure 2 plots the average delay profiles for k = 1000 pieces being disseminated to n = 500 users, for different choices m of user contact list size. From the figure we see that users having contact lists of size two leads to poor performance, but with four or five contacts the average delay profile is comparable to that achieved if each user were to have a complete view of the network.

Figure 2: The average delay profiles of INTERLEAVE for n = 500 users and k = 1000 pieces, for different sizes m of user contact lists.

We now turn our attention to the PRIORITY PUSH protocol. Figure 3 plots the average delay profiles for different choices of the spacing if each user has an entire view of the network. Recall that the source transmits a new piece every l time slots. The final limiting value of each delay profile represents the final fraction of users a typical piece reaches. This is equivalent to the fraction of pieces a typical user ultimately receives. As predicted by Theorem 5, a spacing of l has a limiting value of approximately $1 - e^{-l}$.



7 Discussion

In this work we

• investigated the speedup achieved in file dissemination by breaking the file into pieces,

• investigated the performance of one-sided piece selection protocols,

• designed the efficient piece selection protocol INTERLEAVE,

• illustrated why the PRIORITY PUSH protocol would be useful for the relay of source-coded pieces, and

• designed an efficient two-sided protocol for all-to-all exchange.

We believe that the techniques and results of this work will aid in the understanding of systems 
that involve the decentralized dissemination of large files. We would like to emphasize that the 
dissemination of multiple pieces of data over unstructured networks is significantly different 
from the dissemination of a single piece, but also note that insights gained from single-piece 
dissemination can be effectively leveraged to design protocols for multiple pieces. It would be 
interesting to investigate the spread of multiple pieces in unstructured networks using more 
detailed system models. 



Figure 3: The average delay profiles of PRIORITY PUSH for n = 500 users and k = 1000 pieces, for different values of spacing l.

8 Appendix

8.1 Lemma for Purely Pull-Based Protocols

Suppose n and k are positive integers. Let $B_k = \sum_{i=1}^{k} X_i$, where the $X_i$ are independent, and $X_i$ has the Geo$(\frac{i}{n})$ distribution. Then $B_k$ represents the number of pulls needed (in the best case of sequential pulling) in order for k nodes to acquire a packet initially in one node.

Lemma 10 Let $S_k$ denote the event $S_k = \{B_k \le (1-\epsilon)n\log n\}$. (Suppose that $(1-\epsilon)n\log n$ is integer valued.) Then $P[S_k] \le 2\exp(-n^{-(1-\epsilon)}k)$.

For example, if $k = \frac{n}{\log n}$ then $P[S_k] \le 2\exp(-n^{\epsilon}/\log n)$, or, if $k = n^{1-\epsilon/2}$, then $P[S_k] \le 2\exp(-n^{\epsilon/2})$.

Proof of Lemma 10: $B_k$ has the same distribution as the completion time for the coupon collection problem, starting from an initial collection of coupons that is missing k types of coupons. Thus $S_k$ is the event that in a sample of $(1-\epsilon)n\log n$ random coupons, there is at least one coupon of each of the k missing types. For the sake of comparison, consider the case that a random number N of random coupons is examined, where N is a Poisson random variable with mean $(1-\epsilon)n\log n$. Then $P\{N \ge (1-\epsilon)n\log n\} \ge 1/2$ (see [18], Exercise 5.13). So

$$P[\text{success using } (1-\epsilon)n\log n \text{ coupons}] \le 2\,P[\text{success using } N \text{ coupons}].$$

For the random sample size, the numbers of coupons of different types are independent Poisson random variables with mean $(1-\epsilon)\log n$. Thus, a given type is found in the random size sample of random coupons with probability $1 - e^{-(1-\epsilon)\log n} = 1 - n^{-(1-\epsilon)}$. Hence,

$$P[\text{success using } N \text{ coupons}] = \left(1 - n^{-(1-\epsilon)}\right)^{k} \le \exp\left(-k\,n^{-(1-\epsilon)}\right)$$

The lemma is thus proved. ■
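The final step of the argument can be evaluated numerically. The sketch below is a hedged illustration, assuming the Poissonized success probability $(1 - n^{-(1-\epsilon)})^{k}$ and the resulting bound $2\exp(-k\,n^{-(1-\epsilon)})$; function names are hypothetical and natural logarithms are used.

```python
import math

def poissonized_success(n, k, eps):
    """(1 - n^-(1-eps))^k: probability that all k missing types appear in the Poisson sample."""
    q = n ** (-(1.0 - eps))
    return (1.0 - q) ** k

def lemma10_bound(n, k, eps):
    """The bound 2*exp(-k * n^-(1-eps)) on P[S_k]."""
    return 2.0 * math.exp(-k * n ** (-(1.0 - eps)))
```

For instance, with `k = n / math.log(n)` the bound equals `2 * exp(-n**eps / log(n))`, matching the first example following the lemma.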




8.2 Proof of Determinism of Classical Gossip Process 

Lemma 11 The function G as defined in (6) is a strictly increasing map of [1, n] onto $[2 - \frac{1}{n}, n]$, and it is Lipschitz continuous with Lipschitz constant 2. Furthermore,

$$G(y) \ge 2y\left(1 - \frac{y}{n}\right) \qquad (11)$$

and

$$n - G(y) \le (n-y)\,e^{-y/n} \qquad (12)$$

Proof of Lemma 11: Note that $G(1) = 2 - \frac{1}{n}$ and $G(n) = n$. Differentiating yields

$$G'(y) = \left(1 - \frac{1}{n}\right)^{y}\left(1 - (n-y)\ln\left(1 - \frac{1}{n}\right)\right)$$

Now $0 \le -\ln(1-u) \le \frac{u}{1-u}$ for $0 \le u < 1$. Hence for $y \in [1, n]$,

$$0 < G'(y) \le 1 - (n-1)\log\left(1 - \frac{1}{n}\right) \le 2$$

so that the first sentence of the lemma is proved. Inequality (11) follows from the fact (1 — > 
1 — f t . The function G can be expressed as 

$$G(y) = y + (n-y)\left(1 - \hat{e}^{-y/n}\right)$$

where $\hat{e} = \left(1 - \frac{1}{n}\right)^{-n}$, and (12) follows from the fact $\hat{e} \ge e$. ■
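The properties claimed for G can be spot-checked on a grid. The sketch below is a numerical sanity check rather than a proof; it assumes $G(y) = y + (n-y)(1-(1-1/n)^{y})$ and reads inequalities (11) and (12) as $G(y) \ge 2y(1 - y/n)$ and $n - G(y) \le (n-y)e^{-y/n}$. The function names and the tolerance are choices made for this example.

```python
import math

def G(y, n):
    """G(y) = y + (n - y) * (1 - (1 - 1/n)^y)."""
    return y + (n - y) * (1.0 - (1.0 - 1.0 / n) ** y)

def lemma11_holds(n, samples=1000, tol=1e-9):
    """Check monotonicity, the Lipschitz constant 2, and (11), (12) on a grid over [1, n]."""
    ys = [1.0 + (n - 1.0) * s / samples for s in range(samples + 1)]
    for y in ys:
        g = G(y, n)
        if g < 2.0 * y * (1.0 - y / n) - tol * n:          # inequality (11)
            return False
        if n - g > (n - y) * math.exp(-y / n) + tol * n:   # inequality (12)
            return False
    for y1, y2 in zip(ys, ys[1:]):
        slope = (G(y2, n) - G(y1, n)) / (y2 - y1)
        if not (0.0 < slope <= 2.0 + tol):                  # increasing, Lipschitz constant 2
            return False
    return True
```

A call such as `lemma11_holds(1000)` returns a boolean; a `False` result would indicate a misreading of the inequalities rather than a flaw in the lemma.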

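Lemma 12 For $\epsilon > 0$, if n is sufficiently large, $\hat{Y}_{(1+\epsilon)\log_2 n} \ge \left(1 - \frac{\epsilon}{2}\right)n$.

Proof of Lemma 12: If the lemma is true for some $\epsilon > 0$, then it is trivially true for larger values of $\epsilon$, so without loss of generality it can be assumed that $0 < \epsilon < 0.1$. By (11), if $1 \le \hat{Y}_t \le \frac{\epsilon n}{3}$, then $\hat{Y}_{t+1} \ge \left(2\left(1 - \frac{\epsilon}{3}\right)\right)\hat{Y}_t$. Hence,

$$\hat{Y}_t \ge \min\left\{\left(2\left(1 - \frac{\epsilon}{3}\right)\right)^{t},\; \frac{\epsilon n}{3}\right\} \quad \text{for } t \ge 0$$

Let $t_1 = (1 + (0.9)\epsilon)\log_2 n$. The fact $\ln\left(1 - \frac{\epsilon}{3}\right) \ge -\frac{\epsilon/3}{1 - \epsilon/3} \ge -(0.35)\epsilon$ yields

$$\left(2\left(1 - \frac{\epsilon}{3}\right)\right)^{t_1} \ge \exp\left([\ln 2 - (0.35)\epsilon]\,t_1\right)$$
$$= n\exp\left(\{[\ln 2 - (0.35)\epsilon](0.9) - (0.35)\}\,\epsilon\log_2 n\right)$$
$$\ge n\exp\left((0.20)\,\epsilon\log_2 n\right) \ge n$$

Thus, $\hat{Y}_{t_1} \ge \min\left\{n, \frac{\epsilon n}{3}\right\} = \frac{\epsilon n}{3}$.

Similarly, if $\hat{Y}_t \le \frac{n}{3}$, then $\hat{Y}_{t+1} \ge \frac{4}{3}\hat{Y}_t$. Hence, if $t_2 = t_1 + \ln\left(\frac{1}{\epsilon}\right)/\ln\left(\frac{4}{3}\right)$, then $\hat{Y}_{t_2} \ge \min\left\{\frac{\epsilon n}{3}\left(\frac{4}{3}\right)^{t_2 - t_1}, \frac{n}{3}\right\} = \frac{n}{3}$.

Finally, (12) yields that if $\frac{n}{3} \le \hat{Y}_t \le n$, then $n - \hat{Y}_{t+1} \le (n - \hat{Y}_t)e^{-1/3}$. Hence, if $t_3 = t_2 + 3\left(\ln\left(\frac{2}{3}\right) - \ln\left(\frac{\epsilon}{2}\right)\right)$, then $n - \hat{Y}_{t_3} \le \left(\frac{2n}{3}\right)e^{-(t_3 - t_2)/3} = \frac{\epsilon n}{2}$. That is, $\hat{Y}_{t_3} \ge n\left(1 - \frac{\epsilon}{2}\right)$. If n is sufficiently large, $t_3 - t_1 \le (0.1)\epsilon\log_2 n$, so that $t_3 \le (1+\epsilon)\log_2 n$, and the lemma follows.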



Lemma 13 Given < c < 1 and < e < let t = |_7elog 2 nJ. Then for sufficiently large 
n, 

P{Y to =2^} > l~\n- c (13) 

|2*-F t | < 2 t n' 3e for < t < t Q (14) 
Proof of Lemma 13: Although the push transmissions occur in rounds, the selections of people can be considered sequentially. If the first $2^{t}$ selections are distinct, then $Y_t = 2^{t}$. Each of these selections is distinct from the ones before it with probability at least $1 - \frac{2^{t}}{n}$, so

$$P[Y_t = 2^{t}] \ge \left(1 - \frac{2^{t}}{n}\right)^{2^{t}} \ge 1 - \frac{(2^{t})^{2}}{n}$$
Hence, taking t — t and using the fact t Q < 7e log 2 n + 2, 

P{Y t = 2* : < t < t Q } = P{Y to = 2*°} > 1 - 4n 14e ~ 1 > 1 - \r c 

for n sufficiently large, and (13) is proved. 

If < t < t - 1, then Y t < n 7e . This fact and (11) imply 

V \ 



2Y t > Y t+1 > 2Y t [l--^) > 2Y t (l-n 7 ^) 



Hence, for < t < t Q , 



2 t >Y t > 2*(1 _ n 7e-l) 7e log 2 n 

> 2 t (l-(n 7e_1 )7elog 2 n) 

> 2*(1 — r?T 3e ) for n large enough 



because lOe < 1. Thus, (14) is proved. 



Lemma 14 For t > 0, P [|r m - G(F t )| > Y t n- 3t \Y t ] < exp(-^— ) 



Proof of Lemma 14: The idea is to apply the Azuma-Hoeffding inequality [18]. In a round beginning with $Y_t$ informed people, there are $Y_t$ selections. Each selection has the potential to increase the number of informed people by one. Thus, given $Y_t$, the variable $Y_{t+1} - G(Y_t)$ can be viewed as the ending value of a martingale with $Y_t$ steps, where the interval of uncertainty for each step has length one. ■
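
The concentration of a single round around its conditional mean can be illustrated with a short simulation. This is a sketch under stated assumptions: each informed user pushes to a user chosen uniformly from all $n$ users, so the conditional mean is taken to be $n - (n - y)(1 - 1/n)^y$. The paper defines $G$ earlier and its form may differ; `expected_next`, `one_round`, and `deviations` are hypothetical names.

```python
import random


def expected_next(y, n):
    """Assumed conditional mean of Y_{t+1} given Y_t = y.

    Assumption: each of the y informed users picks a target uniformly from
    all n users, so an uninformed user is missed with probability (1 - 1/n)^y.
    """
    return n - (n - y) * (1.0 - 1.0 / n) ** y


def one_round(y, n, rng):
    """Sample Y_{t+1} given Y_t = y; users 0..y-1 are the informed ones."""
    newly_informed = {u for u in (rng.randrange(n) for _ in range(y)) if u >= y}
    return y + len(newly_informed)


def deviations(y, n, trials, seed=0):
    """Samples of Y_{t+1} - expected_next(y, n) over independent rounds."""
    rng = random.Random(seed)
    mean = expected_next(y, n)
    return [one_round(y, n, rng) - mean for _ in range(trials)]
```

With `y = 1000` and `n = 10000`, the largest absolute value in `deviations(y, n, 200)` can be compared against a threshold of the form $y\,n^{-3\epsilon}$ for a chosen small $\epsilon$ to see how rarely the event in Lemma 14 occurs.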

Proof of Proposition 1: If Proposition 1 is true for $V = 0$, then it is true for any $V \in \mathbb{Z}_+$, because the term $V$ can be covered by using a smaller value of $\epsilon$. Thus, in the proof, we can take $V = 0$. Then Proposition 1.(i) is the same as Lemma 12, proved above. Note that if Proposition 1.(iii) holds for some $\epsilon > 0$, then it also holds for larger $\epsilon$. Thus, Proposition 1.(iii)
can be proved with the additional assumption that e < ij^ 2 . Then, Proposition l.(iii) follows 
easily from Proposition 1.(i) and Proposition 1.(ii). It remains to prove Proposition 1.(ii).

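Let $0 < c < 1$ and $0 < \epsilon <$ . As in Lemma 13, let $t_0 = \lfloor 7\epsilon \log_2 n \rfloor$. Let $E_0$ denote the event $E_0 = \{Y_{t_0} = 2^{t_0}\}$. Lemma 13 implies that $P[E_0] \ge 1 - \tfrac{1}{2} n^{-c}$, and

\[ E_0 \subset \{|Y_t - \overline{Y}_t| \le 2^t n^{-3\epsilon} \text{ for } 0 \le t \le t_0\} \tag{15} \]
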
For $t_0 \le t \le (1 + \epsilon) \log_2 n$, let $E_t = \{|Y_{t+1} - G(Y_t)| \le 2^t n^{-3\epsilon}\}$. If $E_0$ is true and $t \ge t_0$, then
Y t > Y to = 2 to > n 7e , so by Lemma 14, P[E c t \E ] < exp(-^) for t > t . Therefore, with 

\[ E = E_0 \cap \Big( \bigcap_{t_0 \le t \le (1+\epsilon) \log_2 n} E_t \Big), \]

(l+e)log 2 n (l+e)log 2 n _ £ 

t = to t = to 

n~ e 1 
< (log 2 n)exp( — ) < -n~ c 

for $n$ large enough. Thus, $P[E] = (1 - P[E^c \mid E_0])\,P[E_0] \ge (1 - \tfrac{1}{2} n^{-c})^2 \ge 1 - n^{-c}$.

Let $F_t = \{|Y_t - \overline{Y}_t| \le 2^t n^{-2\epsilon}\}$. It remains to show that $E \subset F_t$ for $1 \le t \le (1 + \epsilon) \log_2 n$. Lemma 13.(ii) implies that $E \subset E_0 \subset F_t$ for $0 \le t \le t_0$. So let $t_0 + 1 \le t \le (1 + \epsilon) \log_2 n$. Let $G^k$ denote the composition of the function $G$ with itself $k$ times. Expressing $Y_t$ as a telescoping sum yields

\[ Y_t = G^{t-t_0}(Y_{t_0}) + \sum_{i=t_0}^{t-1} \left[ G^{t-i-1}(Y_{i+1}) - G^{t-i-1}(G(Y_i)) \right] \]

and the definition of $\overline{Y}_t$ yields $\overline{Y}_t = G^{t-t_0}(\overline{Y}_{t_0})$. Thus,

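\[ |Y_t - \overline{Y}_t| \le \left| G^{t-t_0}(Y_{t_0}) - G^{t-t_0}(\overline{Y}_{t_0}) \right| + \sum_{i=t_0}^{t-1} \left| G^{t-i-1}(Y_{i+1}) - G^{t-i-1}(G(Y_i)) \right| \]

Now $G^k$ is Lipschitz continuous with Lipschitz constant $2^k$. On the event $E$, by (15), $|Y_{t_0} - \overline{Y}_{t_0}| \le 2^{t_0} n^{-3\epsilon}$ and, because $E \subset E_i$, $|Y_{i+1} - G(Y_i)| \le 2^i n^{-3\epsilon}$ for $t_0 \le i \le (1 + \epsilon) \log_2 n$. Therefore,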

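\[ |Y_t - \overline{Y}_t| \le 2^{t-t_0} \left( 2^{t_0} n^{-3\epsilon} \right) + \sum_{i=t_0}^{t-1} 2^{t-i-1}\, 2^i n^{-3\epsilon} \le 2^t n^{-3\epsilon} + (t - t_0)\, 2^{t-1} n^{-3\epsilon} \le 2^t n^{-2\epsilon}, \]

assuming $n$ is so large that $(\log_2 n)\, n^{-\epsilon} \le 1$. Therefore, if $n$ is sufficiently large, $E \subset F_t$ for $0 \le t \le (1 + \epsilon) \log_2 n$. This establishes Proposition 1.(ii), and the entire proposition is proved. ■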
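
The telescoping decomposition used in this proof is an algebraic identity that holds for any function $G$ and any sequence, which makes it easy to sanity-check numerically. The sketch below is illustrative only: `iterate` and `telescoping_rhs` are hypothetical helper names, and the function `g` in the usage note is an arbitrary stand-in, not the paper's $G$.

```python
import math


def iterate(g, k, x):
    """Apply g to x a total of k times, i.e. compute G^k(x)."""
    for _ in range(k):
        x = g(x)
    return x


def telescoping_rhs(g, ys, t0, t):
    """Right-hand side of the telescoping identity for Y_t.

    Computes G^{t-t0}(Y_{t0}) plus the sum over i = t0, ..., t-1 of
    G^{t-i-1}(Y_{i+1}) - G^{t-i-1}(G(Y_i)), which should equal ys[t].
    """
    total = iterate(g, t - t0, ys[t0])
    for i in range(t0, t):
        k = t - i - 1
        total += iterate(g, k, ys[i + 1]) - iterate(g, k, g(ys[i]))
    return total
```

As a usage example, with `n = 1000`, `g = lambda y: n - (n - y) * math.exp(-y / n)`, and any list `ys` of numbers, `telescoping_rhs(g, ys, t0, t)` agrees with `ys[t]` up to floating-point rounding for every `t >= t0`. Each summand is the one-step error at step `i` propagated through the remaining `t - i - 1` applications of `g`, which is what the Lipschitz bound on $G^k$ then controls.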